Abstract: Exposure to endogenous sex hormones has been reported as a risk factor for 
breast cancer. The CYP11A1 gene encodes the key enzyme that catalyzes the initial and 
rate-limiting step in steroid hormone synthesis. In this study, the associations between 
single nucleotide polymorphisms (SNPs) in CYP11A1 and breast cancer susceptibility were 
examined. Six SNPs in CYP11A1 were genotyped using the MassARRAY iPLEX platform 
in 530 breast cancer patients and 546 healthy controls. Association analyses based on a χ² 
test and binary logistic regression were performed to determine the odds ratio (OR) and 
95% confidence interval (95% CI) for each SNP. Two loci (rs2959008 and rs2279357) 
showed evidence of associations with breast cancer risk. The variant genotype C/T-C/C of 
rs2959008 was significantly associated with a decreased risk (age-adjusted OR, 0.75; 95% 
CI, 0.58-0.96; P = 0.023) compared with the wild-type TT. However, the homozygous TT 
variant of rs2279357 exhibited increased susceptibility to breast cancer (age-adjusted OR, 
1.44; 95% CI, 1.05-1.98; P = 0.022). The locus rs2959003 also showed an appreciable 
effect, but no associations were observed for three other SNPs. Our results suggest that 
polymorphisms of CYP11A1 are related to breast cancer susceptibility in Han Chinese 
women of South China. 
1. Introduction 

Breast cancer is one of the most common malignancies among women worldwide, and its incidence 
is increasing. In Asian populations, the peak age of breast cancer incidence is much earlier than in 
Western countries, and the mortality rate is increasing [1]. Breast cancer is a heterogeneous 
disease caused by multiple genetic and environmental factors. Many candidate genes that may cause 
breast cancer have been identified in the past few years, including high penetrance genes (BRCA1/2, 
TP53 and PTEN), moderate penetrance genes (ATM, CHEK2, PALB2 and BRIP1), and low penetrance 
genes (FGFR2, ESR1 and TOX3) [2-7]. Inherited mutations in these genes predispose carriers to a high 
risk of breast cancer. The lifetime risk of breast cancer among BRCA1/2 mutation carriers is 82% [8]. 
Mutations of CHEK2, PALB2, and BRIP1 confer 2.34-, 2.3-, and 2.0-fold increased risks of breast cancer, 
respectively, compared with the normal population [5-7]. Numerous genome-wide association studies 
(GWAS) have also identified approximately fifty common genetic loci for breast cancer risk in low 
penetrance genes such as FGFR2, ESR1, LSP1, MAP3K1, RAD51L1 and TOX3 [9-13]. Although each 
locus confers an odds ratio of no more than 1.4, a combination of these variants may act cumulatively to 
increase breast cancer risk. As more breast cancer susceptibility genes of different penetrances are 
identified, more appropriate genetic tests and risk reduction strategies can be developed. 

Previous reports indicate that prolonged exposure to endogenous sex hormones, especially 
estrogens and progestogens, can increase the risk of breast cancer [14,15]. High levels of endogenous 
sex hormones are considered crucial factors associated with breast cancer susceptibility. Several 
case-control studies have also shown evidence that polymorphisms in steroid hormone biosynthesis 
genes are associated with breast cancer susceptibility [16-19]. The CYP11A1 (cytochrome P-450 11A1) 
gene is located at 15q23-q24 and consists of nine exons spanning a total of 29,864 bp. This gene 
encodes the cholesterol side chain cleavage enzyme P450scc, a member of the cytochrome P450 
superfamily of enzymes, which resides on the mitochondrial inner membrane and catalyzes the 
conversion of cholesterol to pregnenolone, the initial and rate-limiting step in steroid hormone 
synthesis [20]. CYP11A1 is primarily expressed in steroidogenic tissues, such as the adrenal cortex, 
gonads, and placenta. Although P450scc is always active, genetic variants of CYP11A1 may alter its 
expression and activity, and thus result in certain hormone-related diseases. Polymorphisms of 
CYP11A1 have been detected as potential markers in different hormone-dependent diseases, including 
breast cancer [16-19], polycystic ovary syndrome (PCOS) [21], prostate cancer [22,23], and 
endometrial cancer [24]. Genetic architecture differs among populations, and the data reported to 
date concentrate mainly on Western populations or pentanucleotide [(TAAAA)n] repeat polymorphisms. 
Here, we investigated the associations between single nucleotide polymorphisms (SNPs) in CYP11A1 
and breast cancer among Han Chinese women in Guangdong province, South China. Six SNPs in 
CYP11A1 (rs2959008, rs2959003, rs2279357, rs11638442, rs2073475, and rs16968478) were 
genotyped in a case-control study of women from this province. 
2. Results and Discussion 

2.1. Study Subjects 

All subjects included in this study were Han Chinese women. The mean ages of the patients and 
controls were 48.48 ± 10.18 and 44.59 ± 11.26 years, respectively. An independent-sample t-test 
indicated a significant difference in age distribution between the two groups (P < 0.05, data not shown), 
so all statistical analyses were subsequently adjusted for age. 
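
The age comparison above can be illustrated from the reported summary statistics alone. The sketch below is a minimal example, not the authors' SPSS procedure: it assumes the unequal-variance (Welch) form of the independent-sample t-test and assumes that the means and standard deviations refer to the 530 cases and 546 controls in the final analysis. The function name `welch_t` is introduced here for illustration only.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first group mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second group mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Assumed group sizes: 530 cases and 546 controls (final analysis set)
t_stat, df = welch_t(48.48, 10.18, 530, 44.59, 11.26, 546)
print(f"t = {t_stat:.2f}, df = {df:.0f}")
```

Under these assumptions the statistic is close to 6, far beyond the two-sided 5% threshold of about 1.96 for samples of this size, which agrees with the reported P < 0.05.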

2.2. Hardy-Weinberg Equilibrium (HWE), Linkage Disequilibrium (LD) and Haplotype Analysis 

All SNPs conformed to Hardy-Weinberg equilibrium (HWE) among both cases and controls 
(P > 0.05) with minor allele frequency (MAF) > 0.22 (Table 1). The SNPs rs11638442, rs2073475, 
and rs16968478 were in high or complete linkage disequilibrium with one another, as were rs2959003 
and rs2279357 (supplementary material). D' values are shown in Figure 1. The SNP rs2959008 showed 
low LD with the other SNPs. 
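
As an illustration of the HWE check, the sketch below applies the textbook Pearson chi-square test (1 degree of freedom) to the control genotype counts of rs2959008 listed in Table 3 (T/T 205, C/T 264, C/C 77). This is an assumed, simplified version; Haploview's own HWE routine may differ in detail. The function name `hwe_chi_square` is hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test of Hardy-Weinberg proportions for one biallelic SNP."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# rs2959008 in controls: T/T, C/T, C/C counts taken from Table 3
chi2, p_value = hwe_chi_square(205, 264, 77)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

The resulting P-value rounds to 0.587, the control value reported for rs2959008 in Table 1.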

Table 1. Genotype characteristics of the six single nucleotide polymorphisms (SNPs). 



SNP           Alleles   MAF a     HWE b (P-value)
                                  Case      Control
rs2959008     C/T       0.36/C    0.976     0.587
rs2959003     C/T       0.38/C    0.852     0.519
rs2279357     C/T       0.43/T    0.403     0.803
rs11638442    C/G       0.22/G    0.659     0.674
rs2073475     A/G       0.44/A    0.951     0.765
rs16968478    A/G       0.46/G    0.834     0.732

a Minor allele frequency; b Hardy-Weinberg equilibrium. 

Figure 1. Linkage disequilibrium (LD) pattern among six SNPs by Haploview analysis. 
Numbers inside the boxes represent D' values for LD; a box without a number corresponds to D' = 100. 




Int. J. Mol. Sci. 2012, 13 



4899 



Six main haplotypes of CYP11A1 were considered in the analysis for all subjects. The associations 
between individual haplotypes and breast cancer are shown in Table 2. The haplotype TTTCAG was the 
most common in both case and control groups. The frequency of haplotype CCCCGA differed 
significantly between the two groups, and this haplotype was protective against the development of 
breast cancer (OR, 0.64; 95% CI, 0.48-0.86; P = 0.0033). 



Table 2. Association between haplotypes and breast cancer for six SNPs in CYP11A1. 

Haplotype a   Frequencies                    OR (95% CI) b       P-value
              Total     Control   Case
TTTCAG        0.2936    0.2755    0.314      1
CCCGGA        0.153     0.1567    0.1495     0.82 (0.62-1.08)    0.15
CCCCGA        0.1309    0.1495    0.1111     0.64 (0.48-0.86)    0.0033
TTCCAG        0.1273    0.1333    0.1203     0.78 (0.58-1.06)    0.11
TTTCGA        0.1177    0.1079    0.1269     1.03 (0.76-1.40)    0.83
TCCCGA        0.0673    0.071     0.0638     0.83 (0.57-1.20)    0.32
rare          0.1102    0.1061    0.1145     0.92 (0.67-1.28)    0.64

a Haplotypes were constructed for rs2959008, rs2959003, rs2279357, rs11638442, rs2073475 and 
rs16968478; b OR (95% CI) was adjusted for age and the bold values indicate P < 0.05. 

2.3. Polymorphisms of CYP11A1 and Breast Cancer Risk 

The genotype distributions of rs2959008 and rs2279357 were significantly different between cases 
and controls in the chosen genetic model (Table 3). For rs2959008, the variant genotype C/T-C/C was 
significantly associated with a decreased risk of breast cancer (OR, 0.75; 95% CI, 0.58-0.96; 
P = 0.023) compared with the wild-type TT. Individuals carrying the C allele showed protection from 
breast cancer (OR, 0.80; 95% CI, 0.67-0.96; P = 0.018). In contrast to rs2959008, the homozygous TT 
variant and allele T of rs2279357 were associated with elevated susceptibility to breast cancer 
(OR, 1.44; 95% CI, 1.05-1.98; P = 0.022 and OR, 1.23; 95% CI, 1.04-1.47; P = 0.018, respectively). 
A correlation between the rs2959003 polymorphism and breast cancer was also observed under the 
log-additive model. Three other SNPs (rs11638442, rs2073475, and rs16968478) did not show 
significant differences between the groups in the present study (supplementary material). 
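
The genetic models referred to above differ only in how the three genotype classes are grouped before comparison. The following sketch shows that grouping for rs2959008, using the codominant counts from Table 3. It is a minimal illustration, not the SNPstats implementation, and the helper name `collapse_models` is hypothetical.

```python
def collapse_models(ref_hom, het, var_hom):
    """Group three genotype counts the way each inheritance model compares them.

    Each entry maps a model name to (reference group count, comparison group count).
    """
    return {
        "dominant": (ref_hom, het + var_hom),      # T/T vs. C/T-C/C
        "recessive": (ref_hom + het, var_hom),     # T/T-C/T vs. C/C
        "overdominant": (ref_hom + var_hom, het),  # T/T-C/C vs. C/T
    }

# rs2959008 codominant counts (T/T, C/T, C/C) from Table 3
controls = collapse_models(205, 264, 77)
cases = collapse_models(229, 239, 62)

for model in controls:
    print(model, "controls:", controls[model], "cases:", cases[model])
```

The grouped counts reproduce the dominant, recessive, and overdominant rows of Table 3 for this SNP (for example, 341 control and 301 case carriers of C/T-C/C under the dominant model).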

A further association analysis was conducted to examine the interaction of the two susceptibility-associated 
SNPs, rs2959008 and rs2279357, and its impact on the risk of breast cancer. The results indicated that 
the combined genotype TT-TT of rs2959008 and rs2279357 was associated with increased breast cancer 
risk (OR, 1.65; 95% CI, 1.18-2.31; P = 0.003) (Table 4). There was also a marginal association between 
genotype CT/CC-TT and reduced breast cancer risk (OR, 0.12; 95% CI, 0.02-0.94; P = 0.044). 

Breast cancer risk is partially determined by endogenous hormone levels [25]. Given the important 
role of P450scc in steroid sex hormone biosynthesis, this case-control study was performed among 
Han Chinese women in Guangdong province. All six SNPs (rs2959008, rs2959003, rs2279357, 
rs11638442, rs2073475 and rs16968478) are located in noncoding regions of CYP11A1, with the first 
four in introns and the latter two upstream of the gene. The results showed that rs2959008 and 
rs2279357 of the CYP11A1 gene were strongly associated with breast cancer risk, as was the 
interaction of the two SNPs. 
Table 3. The genotype distributions (%) and association of breast cancer risk. 

Model          Genotype   Control        Case           OR (95% CI) *             P-value

rs2959008
Codominant     T/T        205 (37.5%)    229 (43.2%)    1                         0.057
               C/T        264 (48.4%)    239 (45.1%)    0.77 (0.59-1.00)
               C/C        77 (14.1%)     62 (11.7%)     0.67 (0.45-[illegible])
Dominant       T/T        205 (37.5%)    229 (43.2%)    1                         0.023
               C/T-C/C    341 (62.5%)    301 (56.8%)    0.75 (0.58-0.96)
Recessive      T/T-C/T    469 (85.9%)    468 (88.3%)    1                         0.16
               C/C        77 (14.1%)     62 (11.7%)     0.77 (0.53-1.11)
Overdominant   T/T-C/C    282 (51.6%)    291 (54.9%)    1                         0.2
               C/T        264 (48.4%)    239 (45.1%)    0.85 (0.67-1.09)
Log-additive                                            0.80 (0.67-0.96)          0.018

rs2959003
Codominant     T/T        196 (36%)      218 (41.3%)    1                         0.092
               C/T        255 (46.9%)    241 (45.6%)    0.84 (0.64-1.10)
               C/C        93 (17.1%)     69 (13.1%)     0.67 (0.46-0.97)
Dominant       T/T        196 (36%)      218 (41.3%)    1                         0.072
               C/T-C/C    348 (64%)      310 (58.7%)    [illegible]
Recessive      T/T-C/T    451 (82.9%)    459 (86.9%)    1                         0.078
               C/C        93 (17.1%)     69 (13.1%)     0.74 (0.52-1.04)
Overdominant   T/T-C/C    289 (53.1%)    287 (54.4%)    1                         0.62
               C/T        255 (46.9%)    241 (45.6%)    0.94 (0.74-1.20)
Log-additive                                            [illegible]               0.03

rs2279357
Codominant     C/C        193 (35.4%)    164 (30.9%)    1                         0.048
               C/T        265 (48.6%)    253 (47.7%)    1.14 (0.87-1.50)
               T/T        87 (16%)       113 (21.3%)    1.56 (1.09-2.22)
Dominant       C/C        193 (35.4%)    164 (30.9%)    1                         0.1
               C/T-T/T    352 (64.6%)    366 (69.1%)    1.24 (0.96-1.61)
Recessive      C/C-C/T    458 (84%)      417 (78.7%)    1                         0.022
               T/T        87 (16%)       113 (21.3%)    1.44 (1.05-1.98)
Overdominant   C/C-T/T    280 (51.4%)    277 (52.3%)    1                         0.82
               C/T        265 (48.6%)    253 (47.7%)    0.97 (0.76-1.24)
Log-additive                                            1.23 (1.04-1.47)          0.018

* OR (95% CI) was adjusted for age and the bold values indicate P < 0.05. 

Table 4. Interaction for rs2959008 and rs2279357 between case and control group. 

Genotype       Control        Case           OR (95% CI) *       P-value
TT-TT          77 (14.1%)     112 (21.1%)    1.65 (1.18-2.31)    0.003
TT-CT/CC       127 (23.3%)    117 (22.1%)    1.10 (0.81-1.48)    0.551
CT/CC-TT       10 (1.8%)      1 (0.2%)       0.12 (0.02-0.94)    0.044
CT/CC-CT/CC    331 (60.7%)    300 (56.6%)    1

Overall comparison of the genotype distribution between groups: χ² = 15.57, P = 0.001. 

* OR (95% CI) was adjusted for age and the bold values indicate P < 0.05. 
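
The overall test statistic of 15.57 in Table 4 can be checked with a plain Pearson chi-square test on the four combined-genotype counts. The sketch below is an assumed reconstruction of that comparison, not the authors' SPSS output; the helper names `pearson_chi_square` and `chi2_sf_df3` are introduced here for illustration.

```python
import math

def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand  # expected count under independence
            chi2 += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df

def chi2_sf_df3(x):
    """Upper-tail probability of the chi-square distribution with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# Rows: TT-TT, TT-CT/CC, CT/CC-TT, CT/CC-CT/CC; columns: control, case (Table 4)
counts = [[77, 112], [127, 117], [10, 1], [331, 300]]
chi2, df = pearson_chi_square(counts)
p_value = chi2_sf_df3(chi2)
print(f"chi2 = {chi2:.2f}, df = {df}, P = {p_value:.3f}")
```

With these counts the statistic reproduces the tabulated 15.57 on 3 degrees of freedom, and the P-value rounds to 0.001.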

Numerous studies have examined CYP11A1 gene polymorphisms. The best studied region is the 
pentanucleotide [TTTTA]n repeat (D15S520) located 528 bp upstream of the translational initiation 
site, which showed six common polymorphisms with four, six, seven, eight, nine, or ten [TTTTA] 
repeats. In the Chinese population, the main variants are the four-, six-, and eight-repeat alleles, which 
are correlated with risks of breast cancer [16,17] and polycystic ovarian syndrome [21]. The [TTTTA]4 
and [TTTTA]6 alleles were also shown to be associated with prostate cancer in other ethnic 
groups [22,23]. However, the [(TAAAA)n] polymorphism differs from the common biallelic 
variations examined in the present study. 

Setiawan et al. systematically evaluated genetic variations spanning 67 kb of CYP11A and 
suggested that common variations in CYP11A may be related to breast cancer susceptibility in a 
western Multiethnic Cohort Study, but no particular SNP or haplotype was identified as the most likely 
causal variant [18]. Mammographic density has also been shown to be a breast cancer risk factor, and 
is strongly associated with hormone exposure profile [26]. In Swedish women, our five tag SNPs 
(rs2959008, rs2959003, rs2279357, rs11638442, and rs16968478) were significantly related to 
mammographic density in single SNP analysis, but the relationship disappeared after correction for 
multiple testing [27]. Another three tag SNPs (rs4555110, rs3825944, and rs7173655) were 
significantly associated with increased (1.3-1.4-fold) endometrial cancer risk in women in Boston, 
USA [24]. Along with another 11 loci, polymorphisms of CYP11A1 were also confirmed to be 
associated with hypertension and blood pressure (BP) in the Japanese population, with the presence of 
more of these alleles indicating higher risk [28]. These results were mainly based on Western 
populations. Haplotype analysis indicated that the promoter haplotype of CYP11A1 was associated 
with increased risk of breast cancer among Chinese women in Shanghai, but three of our SNPs 
(rs2959008, rs2959003, and rs16968478) were not included [19]. Here, we focused on the Han 
Chinese population in Guangdong province and identified breast cancer-related SNPs. Six SNPs that 
have seldom been reported in the Han Chinese population were selected to verify the association. The 
results suggested that the haplotype CCCCGA had a protective effect against breast cancer. Two single 
SNPs, rs2959008 and rs2279357, were significantly associated with breast cancer risk, and interaction 
analysis also demonstrated that the combined genotype TT-TT of these two SNPs was associated with 
a higher risk of breast cancer. 

3. Experimental Section 

3.1. Subjects 

The subjects who participated in this study were recruited between April 2009 and October 2011 from 
Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong Province, China. All 
participants were permanent residents of Guangdong from the Han Chinese population. Clinical 
information for each subject was collected from medical records. 

3.1.1. Patients with Breast Cancer 

A total of 621 female patients with breast cancer were recruited through breast clinics after 
confirmation by pathological diagnosis. Of these, 550 (88.6%) gave written informed consent and 
provided blood samples; their ages ranged from 22 to 80 years. The other 71 patients 
refused to provide blood samples. Blood samples from 14 patients were unusable for DNA extraction 
owing to transportation and storage at -70 °C. After genotyping, 6 samples were excluded because 
none of the six SNPs could be determined. Therefore, 530 case subjects were included in the final 
statistical analysis. 

3.1.2. Controls 

A total of 550 healthy control women, ranging in age from 16 to 84 years, were randomly selected 
from outpatients during the same period. None had a history of cancer or other breast-related 
diseases, as determined by molybdenum target mammography and color Doppler ultrasonography. 
Blood samples were collected from all controls. Four samples failed genotyping, so 546 healthy 
controls were included in the final statistical analysis. 

The study protocol was approved by the Clinical Research Ethics Committee of Nanfang Hospital 
and written informed consent was obtained from all participants. 

3.2. DNA Extraction 

Peripheral blood samples were collected from the participants and stored at -70 °C until DNA 
extraction. Genomic DNA was extracted from peripheral blood samples using an E.Z.N.A.™ blood 
DNA kit (Omega Bio-Tek, Norcross, GA, USA) according to the manufacturer's protocol. 

3.3. Genotyping 

For each SNP, a pair of amplification primers and an extension primer were designed using Assay 
Design 3.1 software (Sequenom, San Diego, CA, USA). Genotypes were generated using the 
SEQUENOM MassARRAY matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) 
mass spectrometry platform (Sequenom, San Diego, CA, USA) according to the manufacturer's 
instructions. This is one of the most powerful tools for detecting insertions, deletions, substitutions, 
and other polymorphisms in amplified DNA, and allows rapid, efficient, and high-throughput detection 
without any interactions. The overall call rates ranged from 99.63% to 100%. 

3.4. Statistical Analysis 

Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) for the six SNPs were 
calculated using Haploview 4.2 (Daly Lab, Cambridge, MA, USA). The genotype and allele 
distributions as well as the interactions of SNPs between case and control subjects were compared. 
Multiple inheritance models (codominant, dominant, recessive, overdominant, and log-additive) were 
chosen to evaluate the associations between each SNP and breast cancer risk. The odds ratio (OR) and 
95% confidence interval (95% CI) were evaluated by binary logistic regression analyses. Because there 
was a significant difference in age distribution between the case and control groups, as described above, 
age was included as a covariate in the association analyses for each SNP. Statistical analyses 
were implemented using SPSS 13.0 software (SPSS, Chicago, IL, USA) and the Web-based tool 
SNPstats [29]. All comparisons were two-sided and P < 0.05 was regarded as statistically significant. 
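
For orientation, the sketch below computes a crude (unadjusted) odds ratio with a Wald 95% confidence interval from a 2 x 2 table, using the recessive-model counts for rs2279357 in Table 3. It is an illustrative simplification: the reported estimates come from age-adjusted logistic regression, so the crude values differ slightly. The function name `odds_ratio_wald` is hypothetical.

```python
import math

def odds_ratio_wald(case_exposed, case_ref, ctrl_exposed, ctrl_ref, z=1.96):
    """Crude odds ratio and Wald confidence interval from the four cells of a 2 x 2 table."""
    or_ = (case_exposed * ctrl_ref) / (case_ref * ctrl_exposed)
    # Standard error of log(OR): square root of the summed reciprocal cell counts
    se = math.sqrt(1 / case_exposed + 1 / case_ref + 1 / ctrl_exposed + 1 / ctrl_ref)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper

# rs2279357, recessive model (Table 3): T/T vs. C/C-C/T
# cases: 113 T/T, 417 C/C-C/T; controls: 87 T/T, 458 C/C-C/T
or_, lower, upper = odds_ratio_wald(113, 417, 87, 458)
print(f"crude OR = {or_:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

The crude estimate is close to the age-adjusted value of 1.44 (1.05-1.98) given in Table 3, which suggests that the age adjustment had only a modest effect for this SNP.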
4. Conclusions 

In summary, our results showed that rs2959008 and rs2279357 of the CYP11A1 gene were 
significantly associated with breast cancer susceptibility among Guangdong Han Chinese women, and 
the interaction of these two SNPs was associated with elevated risk. The haplotype CCCCGA of our 
six SNPs also had a protective effect against the development of breast cancer. The biological 
mechanisms behind these associations remain unknown. Therefore, further comprehensive 
investigations of steroid hormone biosynthesis and metabolism gene variations combined with other 
risk factors are required to identify biomarkers for inherited breast cancer susceptibility. 